Skip to content

WIP: Route inline reasoning streams#3446

Merged
Hmbown merged 3 commits into
mainfrom
issue/3222-inline-reasoning-tags
Jun 23, 2026
Merged

WIP: Route inline reasoning streams#3446
Hmbown merged 3 commits into
mainfrom
issue/3222-inline-reasoning-tags

Conversation

@Hmbown

@Hmbown Hmbown commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Summary

  • add provider-scoped reasoning_stream_style config for separate_field, inline_tags, and none stream behavior
  • route inline <think>...</think> content into Thinking blocks, including tags split across SSE chunks
  • document the provider-table key for OpenAI-compatible gateways

Tests

  • cargo fmt --all -- --check
  • cargo test -p codewhale-tui --bin codewhale-tui --locked reasoning_style
  • cargo test -p codewhale-tui --bin codewhale-tui --locked provider_config_loads_reasoning_stream_style
  • cargo test -p codewhale-tui --bin codewhale-tui --locked stream_decoder_tests
  • cargo test -p codewhale-tui --bin codewhale-tui --locked stream_reasoning (0 matching tests)
  • git diff --check

Refs #3222.

Reported with detailed route-shape examples by @buko.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new configuration option, reasoning_stream_style, which allows users to customize how streaming reasoning content is handled and separated from the main answer text. It supports separate fields, inline tags (e.g., <think>...</think>), or treating reasoning content as standard text. The changes include updates to client initialization, SSE chunk parsing, configuration loading, documentation, and comprehensive unit tests. The reviewer suggested using fully qualified paths for SystemTime and UNIX_EPOCH in the new configuration test to avoid potential compilation issues if they are not imported.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +4122 to +4125
let nanos = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_nanos();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential compilation errors if SystemTime and UNIX_EPOCH are not imported in this test module, it is safer to use their fully qualified paths from std::time.

Suggested change
let nanos = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_nanos();
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_nanos();

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@Hmbown Hmbown marked this pull request as ready for review June 23, 2026 05:56

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@Hmbown Hmbown force-pushed the issue/3222-inline-reasoning-tags branch from 53ef812 to 3636629 Compare June 23, 2026 08:59
@claude

claude Bot commented Jun 23, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 5m 19s —— View job


PR Review: Route inline reasoning streams

  • Gather context and understand request
  • Read diff and changed files
  • Review correctness and logic
  • Check provider/model/route safety
  • Check reuse and simplification
  • Check test coverage
  • Post findings

Summary

The routing logic is sound and the state machine for split-chunk tag parsing is correct. A few issues worth addressing before this ships, in priority order:


1. Silent fallback on unrecognized config value — parse_reasoning_stream_style (chat.rs:2165–2173)

fn parse_reasoning_stream_style(value: &str) -> Option<ReasoningStreamStyle> {
    match value.trim().to_ascii_lowercase().replace('-', "_").as_str() {
        ...
        _ => None,
    }
}

reasoning_stream_style_for_stream calls this and, on None, falls through to the provider/model heuristic. A user who typos "inline_tag" (singular) gets silently ignored — their configured gateway behaviour is then driven by is_reasoning_model_for_stream, which returns SeparateField for reasoning providers and None otherwise. Neither is what they intended. A logging::warn (or even a debug log) when the configured value goes unrecognized would surface this during setup. Fix this →


2. InlineTags mode also consumes reasoning_content fields (chat.rs:2742–2751)

if reasoning_stream_style != ReasoningStreamStyle::None
    && let Some(reasoning) = reasoning_text.as_deref()
{
    push_thinking_delta(...);
}

The condition != None means InlineTags also routes a reasoning_content/reasoning delta to push_thinking_delta. For a gateway configured with inline_tags that also happens to emit reasoning_content (e.g. an OpenRouter passthrough with metadata), both paths activate: reasoning_content starts a Thinking block, then the content string with <think> tags is split by inline_reasoning_segments. The block-transition logic in push_text_delta/push_thinking_delta handles it correctly (no panic, no dropped content), but the downstream consumer sees more thinking-block starts than expected.

If the intent is "InlineTags gateways don't use the separate field", the condition should be == ReasoningStreamStyle::SeparateField. If the intent is "route both if present", that's fine but should be documented. Fix this →


3. Missing test: unclosed <think> tag at stream end (chat.rs:2891–2915)

The flush path (inline_reasoning_segments("", state, true) at finish_reason) correctly emits remaining pending content as Thinking when inside_think == true. But there is no test for this — a truncated stream ending with an open <think> and no matching </think>. Worth a dedicated #[test] that sends a single chunk with <think> content and no close tag, followed by a finish-reason chunk, asserting the partial reasoning is emitted rather than silently dropped. Fix this →


4. Module-level doc comment names a now-test-only function (chat.rs:4)

//! ... SSE parsing (`parse_sse_chunk`) all live here.

parse_sse_chunk is now #[cfg(test)]; the production entry point is parse_sse_chunk_with_reasoning_style. The doc comment should be updated to avoid confusion. Fix this →


False alarm from Gemini

The inline comment suggestion on config/tests.rs:4241 (use std::time::SystemTime::now() fully qualified) is not needed — SystemTime and UNIX_EPOCH are already imported at config/tests.rs:8 via use std::time::{SystemTime, UNIX_EPOCH}.


What's good

  • trailing_tag_prefix_len — the is_char_boundary guard handles UTF-8 content (e.g. CJK text before a tag) correctly; the O(tag.len()) scan is fine for 7–9 char tags.
  • push_reasoning_segment empty filter — prevents spurious ContentBlockStart events when a split produces an empty leading/trailing segment.
  • #[cfg(test)] shim on parse_sse_chunk — clean pattern; the old signature is a thin wrapper around parse_sse_chunk_with_reasoning_style and gated so it doesn't pollute production exports. All callers in client.rs are inside #[cfg(test)] mod tests, so nothing breaks.
  • Config merge at merge_provider_config — the new field is included correctly with or semantics (override wins, then base).
  • reasoning_stream_style_for_stream override priority — explicit config beats model/provider heuristic, which is the right default for OpenAI-compatible gateways.
  • Block transition logic in push_text_delta/push_thinking_delta — correctly stops the opposite block and increments content_index before opening the new one; no index bookkeeping bugs found.

Hmbown added 3 commits June 23, 2026 08:45
Add provider-scoped reasoning_stream_style handling for chat-completion streams so OpenAI-compatible routes can choose separate_field, inline_tags, or none without a new provider enum variant.

Inline <think> blocks are streamed into Thinking content blocks, including tags split across SSE chunks, while none preserves raw visible text and existing separate-field reasoning keeps its default behavior.

Refs #3222.
Limit the legacy parse_sse_chunk wrapper to test builds so release and clippy builds use only the style-aware stream parser path.

Refs #3222.
Warn when reasoning_stream_style contains an unrecognized configured value, keep inline_tags mode from also consuming separate reasoning_content fields, and cover unclosed <think> stream flushes.

Verified with:

- cargo fmt --all -- --check

- git diff --check

- cargo test -p codewhale-tui --bin codewhale-tui --locked reasoning_style

- cargo test -p codewhale-tui --bin codewhale-tui --locked provider_config_loads_reasoning_stream_style

- cargo test -p codewhale-tui --bin codewhale-tui --locked stream_decoder_tests

- cargo test -p codewhale-tui --bin codewhale-tui --locked qianfan_provider_accepts_custom_model_and_base_url

- cargo clippy -p codewhale-tui --bin codewhale-tui --locked -- -D warnings -A clippy::uninlined_format_args -A clippy::too_many_arguments -A clippy::unnecessary_map_or -A clippy::assertions_on_constants
@Hmbown Hmbown force-pushed the issue/3222-inline-reasoning-tags branch from 3636629 to 6900776 Compare June 23, 2026 15:50
@claude

claude Bot commented Jun 23, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 2m 41s —— View job


PR Review: Route inline reasoning streams (updated)

  • Gather context and understand request
  • Check what changed in latest commits vs previous review
  • Read updated diff and key files
  • Review correctness and logic
  • Check if previous findings were addressed
  • Post findings

Follow-up on Previous Findings

The two commits pushed since the initial review (7993d15 and 6900776) fully address all four issues that were flagged. Quick verification:

Finding Fix
Silent fallback on unrecognized config value logging::warn now emitted in reasoning_stream_style_for_stream when parse_reasoning_stream_style returns None
InlineTags also consuming reasoning_content Condition changed from != ReasoningStreamStyle::None== ReasoningStreamStyle::SeparateField; confirmed by new reasoning_style_inline_tags_ignores_separate_reasoning_field test
Missing test for unclosed <think> at stream end reasoning_style_inline_tags_flushes_unclosed_think_at_stream_end added; flush path confirmed correct
Module doc naming parse_sse_chunk (now #[cfg(test)]) Doc comment updated to parse_sse_chunk_with_reasoning_style

Fresh Review of Current State

The implementation looks clean. A few observations on the final shape:

State machine (inline_reasoning_segments) — the trailing_tag_prefix_len hold-back loop is correct, including the is_char_boundary guard for multi-byte content. The flush = true path correctly zero-holds and emits everything, so truncated streams don't silently drop content.

Match structure in parse_sse_chunk_with_reasoning_style — the four arms are exhaustive and non-overlapping:

  • (Some, InlineTags) → segment through inline_reasoning_segments
  • (Some, _) → plain text (covers SeparateField and None with a content field)
  • (None, None) → fall through reasoning_text as text (legacy passthrough gateways)
  • (None, _) → no-op (reasoning field already handled or not applicable)

This correctly preserves the legacy behavior for unknown gateways while adding the two new explicit modes.

parse_sse_chunk test shim — gated #[cfg(test)], creates a fresh InlineReasoningTagState per call (correct for the pre-existing tests that don't test inline tag cross-chunk state). New tests use decode_chunks_with_style which correctly carries state across chunks.

Configreasoning_stream_style stored as Option<String> on DeepSeekClient and parsed lazily at stream start; merge_provider_config uses or semantics (override wins). Both are correct.

Test coverage — 7 new tests cover: separate field routing, inline tag routing (including split across chunks), unclosed-tag flush, ignoring reasoning_content in InlineTags mode, None passthrough, override priority, and config deserialization.

Minor notes (not blocking):

  • parse_reasoning_stream_style accepts several undocumented aliases ("inline", "think_tags", "separate", "text", "off", etc.). The doc only mentions the canonical three. These won't cause user confusion, but if the aliases are intentional they're worth a note in CONFIGURATION.md.
  • The tag match is case-sensitive (<think> only). That's the de-facto standard for these gateways, so it's fine for the current scope.

No new correctness issues found. This is ready to merge.

@Hmbown Hmbown merged commit 2a4c67a into main Jun 23, 2026
15 checks passed
@Hmbown Hmbown deleted the issue/3222-inline-reasoning-tags branch June 23, 2026 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant